Harmony Assumptions in Information Retrieval and Social Networks

Authors

  • Thomas Roelleke
  • Andreas Kaltenbrunner
  • Ricardo A. Baeza-Yates
Abstract

In many applications, independence of event occurrences is assumed, even if there is evidence for dependence. Capturing dependence leads to complex models, and even where the complex models are superior, they fail to beat the simplicity and scalability of the independence assumption. Therefore, many models assume independence and apply heuristics to improve results. Theoretical explanations of the heuristics are seldom given or generalizable. This paper reports that some of these heuristics can be explained as encoding dependence in an exponent based on the generalized harmonic sum. Unlike independence, where the probability of subsequent occurrences of an event is the product of the single-event probabilities, harmony is based on a product with a decaying exponent. For independence, the sequence probability is $p^{1+1+\cdots+1} = p^n$, whereas for harmony, it is $p^{1+1/2+\cdots+1/n}$. The generalized harmonic sum leads to a spectrum of harmony assumptions. This paper shows that harmony assumptions naturally extend probability theory. An experimental evaluation for information retrieval (IR; term occurrences) and social networks (SNs; user interactions) shows that assuming harmony is more suitable than assuming independence. The potential impact of harmony assumptions lies beyond IR and SNs, since many applications rely on probability theory and apply heuristics to compensate for the independence assumption. Given the concept of harmony assumptions, the dependence between multiple occurrences of an event can be reflected in an intuitive and effective way.
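The contrast between the two sequence probabilities in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: the function names and the parameter `alpha` are assumptions introduced here. It assumes the generalized harmonic sum has the form 1 + 1/2^alpha + ... + 1/n^alpha, so that `alpha = 0` recovers independence (exponent n) and `alpha = 1` gives the harmony exponent 1 + 1/2 + ... + 1/n.

```python
def generalized_harmonic_sum(n: int, alpha: float = 1.0) -> float:
    """Return 1/1**alpha + 1/2**alpha + ... + 1/n**alpha.

    `alpha` is a hypothetical parameter name for this sketch:
    alpha = 0 yields n, alpha = 1 yields the ordinary harmonic sum.
    """
    return sum(1.0 / k**alpha for k in range(1, n + 1))


def sequence_probability(p: float, n: int, alpha: float = 1.0) -> float:
    """Probability of n occurrences of an event with single-event probability p,
    using the generalized harmonic sum as the exponent."""
    return p ** generalized_harmonic_sum(n, alpha)


if __name__ == "__main__":
    p, n = 0.1, 3
    independence = sequence_probability(p, n, alpha=0.0)  # p ** n
    harmony = sequence_probability(p, n, alpha=1.0)       # p ** (1 + 1/2 + 1/3)
    print(f"independence: {independence:.6f}")
    print(f"harmony:      {harmony:.6f}")
```

Because the harmony exponent grows more slowly than n, repeated occurrences are penalized less than under independence for any 0 < p < 1 and n > 1, which is one way to read the dependence between multiple occurrences described above.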

Similar resources

Health Information Seeking Behavior of Graduate Students Linked to Corona Virus at Qom University

Objective: Health information on diseases can help prevent their spread and support their treatment, and it is among people's most vital daily needs. One health issue that has plagued the world in recent years is the coronavirus. Therefore, the main purpose of this study was to investigate the health information behavior of graduate students at Qom University. Methodology: Applied descriptive survey...

A Knowledge Management Approach to Discovering Influential Users in Social Media

A key step toward a marketer's success is discovering influential users who diffuse information on social media and whose followers are interested in that information and spread it further. They can reduce the cost of advertising, increase sales, and maximize the diffusion of information. A key problem is how to precisely identify the most influential users on social networks. In this pape...

Intellectual Structure of Knowledge in Information Behavior: A Co-Word Analysis

Background and Aim: The intellectual structure of knowledge and its research front can be identified by co-word analysis. This research attempts to reveal the intellectual structure of knowledge in information behavior inquiries, via co-word, network analysis, and science visualization tools. Methods: Bibliometric methodology and social network analysis are used. Population comprises 2146 recor...

Investigating the Impact of Authors’ Rank in Bibliographic Networks on Expertise Retrieval

Background and Aim: This research investigates the impact of authors' rank in bibliographic networks on the document-centered model of expertise retrieval. Its purpose is to find out what kind of author ranking in bibliographic networks can improve the performance of the document-centered model. Methodology: The current research is experimental. To operationalize research goals, a new test colle...

Combining Harmony search algorithm and Ant Colony Optimization algorithm to increase the lifetime of Wireless Sensor Networks

Wireless sensor networks are a new generation of networks that typically consist of large numbers of nodes communicating wirelessly. The main goal of these networks is to collect data from the environment surrounding the sensors. Since the sensor nodes are battery operated and there is no possibility of charging or replacing the batteries, the lifetime of ...

Journal:
  • Comput. J.

Volume: 58

Publication date: 2015